-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize IntSet.Bin #998
Optimize IntSet.Bin #998
Conversation
* Replace the separate Prefix and Mask Int fields in the Bin constructor with a single Int field which contains both merged together. This reduces the memory required by a Bin from 5 to 4 words, at the cost of more computations (which are cheap bitwise ops) being necessary for certains operations. This follows a similar change done for IntMap.Bin. * Benchmarks show that runtimes for most operations remain unchanged or decrease by a small amount (<10%). As expected, allocations are consistently lower by 11-16% for all set operations that have to make O(log n) allocations. * The functions and types used by both IntSet and IntMap have been moved into a IntTreeCommons module.
This isn't related to your changes, but it looks like the |
@@ -334,7 +363,7 @@ null _ = False | |||
size :: IntSet -> Int | |||
size = go 0 | |||
where | |||
go !acc (Bin _ _ l r) = go (go acc l) r | |||
go !acc (Bin _ l r) = go (go acc l) r | |||
go acc (Tip _ bm) = acc + bitcount 0 bm | |||
go acc Nil = acc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does bitcount
take two arguments? Should we just use popCount
directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does
bitcount
take two arguments?
This seems to be left over from when bitcount
was a loop and the first parameter was the accumulator: 4952822#diff-06e022e2fe36764ea9baec24a03a8186d708cc561677a2e6583f08fe180ca073L73-L76
What I don't understand is why it is acc + bitcount 0 bm
rather than bitcount acc bm
😄
Should we just use
popCount
directly?
Probably. Might be better in a different PR though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be a totally reasonable "by the way" for this one, if you're so inclined. It strictly improves readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, we should get rid of the note citing a source for the highest bit algorithm that we no longer use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we were considering removing the Nat
stuff in #995, I think it would a better fit with those changes. Same for the obsolete note.
-- * All keys in a Bin start with the Bin's shared prefix. | ||
-- * All keys in the Bin's left child have the Prefix's mask bit unset. | ||
-- * All keys in the Bin's right child have the Prefix's mask bit set. | ||
prefixOk :: IntSet -> Property |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see anything checking the prefixes of Bin
s other than the root. Shouldn't this be called prefixesOk
, and perform a recursive test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For test performance reasons, I suspect we want to consider validity of the prefix of a Bin
relative to that of its parent when making the test recursive, treating the (peculiarly simple) case of the root specially.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed 🤦
For test performance reasons, I suspect we want to consider validity of the prefix of a Bin relative to that of its parent when making the test recursive, treating the (peculiarly simple) case of the root specially.
Is the trouble really worth it?
$ cabal run intset-properties
...
All 60 tests passed (0.21s)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the 0.21s, no. But why does it run for such a short time? Can we make it run more tests? Larger tests? Ideally I want to use more efficient tests and also get that testing time up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Separately, I suspect that a relative test is likely to help give more useful information in case of failure, though I don't know that for sure. Regardless, I won't block up this PR for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make it run more tests?
This would be the easiest, we can increase the number of quickcheck runs.
Ideally I want to use more efficient tests and also get that testing time up.
Well, if it helps catch errors. Don't know how one would judge that, except maybe coverage. Otherwise it is a waste.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Data.Sequence
, that's helpful for dealing with "special casing" in some parts of the code. Maybe it's overkill here.
t -> | ||
valid t .&&. | ||
toAscList t === List.sort (nub ((List.intersect) (xs) (ys))) | ||
toAscList t === (toAscList xs List.\\ toAscList ys) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Data.List.Ordered.minus
would be faster than \\
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean from data-ordlist? Surely this is not worth a new dependency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, from that package. I imagine we can copy what we like, though I haven't checked its license to be sure. My only real concern about the dependency is that the package doesn't seem to be actively maintained, so it might run into annoying issues with base
changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the tests are quite fast (as mentioned the other comment), this seems like something that shouldn't cause too much worry.
Besides, a new function/dependency is another chance to introduce a bug. Data.List is a little more trustworthy in that regard, I'd say.
case intersection xs ys of | ||
t -> | ||
valid t .&&. | ||
toAscList t === (toAscList xs `List.intersect` toAscList ys) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, Data.List.Ordered.isect
should be faster than List.intersect
.
@treeowl, anything I can do here? |
| otherwise = go l r | ||
go def (Bin p l r) | nomatch x p = if x < unPrefix p then unsafeFindMax def else unsafeFindMax r | ||
| left x p = go def l | ||
| otherwise = go l r |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't related to your changes, per se, but we use somewhat different algorithms for lookup
in IntMap
than in member
for IntSet
, and the reasoning doesn't seem to be documented in the source. As I recall, IntMap
lookup is (as of relatively recently) optimized for the common case that the key is present in the map, whereas IntSet
membership is optimized for the (presumably?) common case that the element is not in the set. What's the right reasoning for lookupLT
and lookupGT
here? I don't have much of a clue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I recall,
IntMap
lookup is (as of relatively recently) optimized for the common case that the key is present in the map...
Yes, appears to be so due to #800
What's the right reasoning for
lookupLT
andlookupGT
here? I don't have much of a clue.
I'm not sure either. Note that the lookup
logic (using zero
, before #995) could not be directly applied to lookup{L,G}{T,E}
. It checked specific bits and was not a binary search. However, the current logic with left
can be applied without the nomatch
check. This seems worth checking out, but we don't have to do it here I think.
I'm dealing with a family medical emergency this week. I hope to give this a final review shortly. |
@treeowl I hope you are able to find time for this. I would really like to finish this work. |
Yes, I'm so sorry about the delay. Problems continue on my end, but that's no reason to hold you up. Merged! Will you try the |
Thank you!
This was the |
No, I'm just incredibly tired from what's been going on at home.
…On Tue, Jun 4, 2024, 9:43 AM Soumik Sarkar ***@***.***> wrote:
Thank you!
Will you try the IntSet now?
This was the IntSet PR. Are you thinking of something else?
—
Reply to this email directly, view it on GitHub
<#998 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOOF7IFP3VXBJOJRKYWDLLZFXAAHAVCNFSM6AAAAABF2HCEOWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNBXGU3TGOBTGQ>
.
You are receiving this because you modified the open/close state.Message
ID: ***@***.***>
|
I hope things get better 🙂 |
Prefix
andMask
Int
fields in theBin
constructor with a singleInt
field which contains both merged together. This reduces the memory required by aBin
from 5 to 4 words, at the cost of more computations (which are cheap bitwise ops) being necessary for certains operations. This follows a similar change done forIntMap.Bin
(Optimize IntMap.Bin #995).IntSet
andIntMap
have been moved into aIntTreeCommons
module.Fixes #991.
Memory
Concretely, this reduces the memory required by an
IntSet
by ~12.5%.Calculations: For a tree with n
Tip
s, eachTip
s costs 3 words and there aren-1
Bin
s each costing 5 words before this change and 4 words after. So we save about 1 out of 8 words.Compatibility
This PR makes breaking changes to
IntSet
internals which is reflected in the exports of the internal moduleData.IntSet.Internal
. Due to the introduction ofIntTreeCommons
, some exports ofData.IntMap.Internal
are moved toIntTreeCommons
. There is no change in the exports of any other module.Benchmarks
Benchmarks done with GHC 9.6.3.
Last updated: 080dde1
Benchmark command:
cabal run <target> -- --csv <csv> +RTS -T
intset-benchmarks
set-operations-intset